A comparison of training algorithms for DHP adaptive critic neurocontrol
Abstract
A variety of alternative training strategies for implementing the Dual Heuristic Programming (DHP) method of approximate dynamic programming in the neurocontrol context are explored. The DHP method of controller training has been successfully demonstrated by a number of authors on a variety of control problems in recent years, but no unified view of the implementation details of the method has yet emerged. A number of options are described here for sequencing the training of the Controller and Critic networks in DHP implementations. Results are reported on their relative efficiency and on the quality of the resulting controllers for two benchmark control problems.
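To make the idea of sequencing Critic and Controller training concrete, here is a minimal sketch in Python. It is not the paper's algorithm or either of its benchmark problems: the scalar linear plant, the quadratic utility, the linear critic `lambda(x) = w*x`, the linear controller `u = -k*x`, and the names `train_dhp`, `critic_steps`, and `controller_steps` are all assumptions chosen for illustration. The critic is trained toward the usual DHP target (the total derivative of utility plus discounted cost-to-go with respect to the state), and the two arguments control how many critic updates and controller updates are performed in each cycle.

```python
import numpy as np

# Hypothetical scalar plant x' = A*x + B*u with utility U = Q*x^2 + R*u^2.
A, B, Q, R, GAMMA = 0.9, 0.5, 1.0, 0.1, 0.95
STATES = np.array([-1.0, -0.5, 0.5, 1.0])  # fixed set of training states


def critic_step(w, k, lr):
    """One gradient step moving the critic lambda(x) = w*x toward its DHP target."""
    x = STATES
    u = -k * x
    x_next = A * x + B * u
    # DHP target: dU/dx + dU/du * du/dx + gamma * lambda(x') * dx'/dx,
    # where du/dx = -k and dx'/dx = A - B*k for this plant and controller.
    target = 2 * Q * x + (2 * R * u) * (-k) + GAMMA * (w * x_next) * (A - B * k)
    return w + lr * np.mean((target - w * x) * x)


def controller_step(w, k, lr):
    """One gradient step on the controller gain k using the critic's estimate."""
    x = STATES
    u = -k * x
    x_next = A * x + B * u
    # dU/du + gamma * lambda(x') * dx'/du
    grad_u = 2 * R * u + GAMMA * (w * x_next) * B
    # du/dk = -x, so descending the cost gives k <- k + lr * mean(grad_u * x).
    return k + lr * np.mean(grad_u * x)


def train_dhp(cycles, critic_steps, controller_steps, lr=0.1):
    """Alternate blocks of critic and controller updates; returns (w, k)."""
    w, k = 0.0, 0.0
    for _ in range(cycles):
        for _ in range(critic_steps):
            w = critic_step(w, k, lr)
        for _ in range(controller_steps):
            k = controller_step(w, k, lr)
    return w, k


def lqr_reference(iters=2000):
    """Discounted scalar Riccati iteration, used only as a check on the sketch."""
    p = Q
    for _ in range(iters):
        p = Q + GAMMA * A**2 * p - (GAMMA * A * B * p) ** 2 / (R + GAMMA * B**2 * p)
    k = GAMMA * A * B * p / (R + GAMMA * B**2 * p)
    return 2 * p, k  # dJ/dx = 2*p*x, so the ideal critic weight is 2*p
```

Calling `train_dhp(3000, 1, 1)` interleaves single updates, while `train_dhp(300, 10, 10)` trains each network in blocks of ten; on this toy problem both should approach the values returned by `lqr_reference()`. The sketch says nothing about which sequencing is more efficient on realistic, nonlinear problems.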
Related resources
Qualitative Models for Adaptive Critic Neurocontrol
We demonstrate the use of qualitative models in the DHP method of training neurocontrollers. Two Fuzzy approaches to developing qualitative models are explored: a priori application of problem specific knowledge, and estimation of a first order TSK Fuzzy model. These approaches are demonstrated respectively on the cart-pole system and a non-linear multiple-input multiple-output plant proposed by...
Adaptive Critic Based Approximate Dynamic Programming for Tuning Fuzzy Controllers
This work was supported by the National Science Foundation under grant ECS-9904378. Abstract: In this paper we show the applicability of the Dual Heuristic Programming (DHP) method of Approximate Dynamic Programming to parameter tuning of a fuzzy control system. DHP and related techniques have been developed in the neurocontrol context but can be equally productive when used with fuzzy controll...
Partial, noisy and qualitative models for adaptive critic based neurocontrol
The roles of plant models in adaptive critic methods for approximate dynamic programming are considered, with primary focus given to the DHP methodology. In place of complete system identification, partial, approximate, and qualitative models of plant dynamics are considered. Such models are found to be sufficient for successful controller design. As classification is in general easier than reg...
Model-Based Adaptive Critic Designs
Editor’s Summary: This chapter provides an overview of model-based adaptive critic designs, including background, general algorithms, implementations, and comparisons. The authors begin by introducing the mathematical background of model-reference adaptive critic designs. Various ADP designs such as Heuristic Dynamic Programming (HDP), Dual HDP (DHP), Globalized DHP (GDHP), and Action-Dependent...
Adaptive Critic Designs - Neural Networks, IEEE Transactions on
We discuss a variety of adaptive critic designs (ACD’s) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: Heuristic dynamic programming (HDP), dual heu...